Release 10.1A: OpenEdge Development:
Messaging and ESB
Publishing, subscribing, and receiving an XML document in a BytesMessage
The procedures example12.p and example13.p use a MEMPTR variable to publish and receive an XML document in a BytesMessage, which prevents code-page conversions. The code pages of the document and the 4GL client do not have to match.
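The effect of sending the document as bytes can be illustrated outside 4GL. The following Python sketch is not part of example12.p or example13.p; the function `send_as_bytes` and the sample document are hypothetical stand-ins. It assumes only that a bytes-style message delivers its payload unchanged, so the receiver's parser sees exactly the bytes, and the encoding declaration, that the sender produced.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample document: UTF-8 encoded, with a matching declaration.
# The euro sign (U+20AC) has no representation in ISO8859-1.
doc_bytes = (
    '<?xml version="1.0" encoding="UTF-8"?>'
    "<price>\u20ac10</price>"
).encode("utf-8")


def send_as_bytes(payload: bytes) -> bytes:
    """Stand-in for a bytes message: the payload is delivered unchanged."""
    return bytes(payload)


received = send_as_bytes(doc_bytes)

# The parser decodes according to the document's own declaration,
# independently of any code page the receiving client uses for text.
root = ET.fromstring(received)
print(received == doc_bytes)
print(root.text == "\u20ac10")
```

Because no text conversion is applied along the way, the document can contain characters that the receiving client's internal code page cannot represent.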
To run example12.p and example13.p:
- Run example13.p to subscribe to and receive a BytesMessage containing an XML document.
- Run example12.p to publish the BytesMessage containing an XML document.
XML code page encoding
4GL applications work with the built-in XML parser, so it is important to consider the code-page encoding of XML messages. In principle, XML documents can be encoded with any code page. However, a given XML parser might support only some code pages, and parsers also differ in the code-page conversions that they support.
4GL clients set and get XML text using the 4GL CHARACTER data type. CHARACTER data is encoded by the 4GL interpreter according to the internal code page (the -cpinternal startup parameter). The 4GL-JMS implementation automatically converts the text to Unicode when it is sent to the JMS server, and from Unicode to the client’s internal code page when the text is sent from the server to the client.

In general, when the characters used by the XML document are from the 7-bit ASCII subset, there are no issues the 4GL programmer has to consider. Otherwise, observe the guidelines in the following examples.
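Why the 7-bit ASCII subset is safe can be seen by comparing byte forms. The Python sketch below is an illustration only, not 4GL-JMS code: the helper names and sample strings are hypothetical, and `round_trip` merely models the conversion described above (internal code page to Unicode, then back).

```python
# Code pages compared here; cp1252 is included only as an extra illustration.
CODEPAGES = ("iso8859-1", "cp1252", "utf-8")


def byte_forms(text: str) -> dict:
    """Return the encoded bytes of text under each code page."""
    return {cp: text.encode(cp) for cp in CODEPAGES}


def is_codepage_neutral(text: str) -> bool:
    """True when every code page above yields the same bytes."""
    return len(set(byte_forms(text).values())) == 1


def round_trip(raw: bytes, internal: str) -> bytes:
    """Model a text message: internal code page -> Unicode -> internal."""
    unicode_text = raw.decode(internal)   # client to server
    return unicode_text.encode(internal)  # server to client


plain = "<order id='42'/>"          # 7-bit ASCII only
accented = "<name>Ren\u00e9</name>"  # contains e-acute (U+00E9)

print(is_codepage_neutral(plain))
print(is_codepage_neutral(accented))
print(byte_forms(accented)["iso8859-1"])
print(byte_forms(accented)["utf-8"])
```

The accented character occupies one byte in ISO8859-1 and two bytes in UTF-8, which is why text outside the ASCII subset must be handled in a single, known code page.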
Code page example 1
In this example, two 4GL clients use the ISO8859-1 code page:
- Client1 sets XML text in an XMLMessage and sends it.
- Client2 receives the message, extracts the text, stores it in a MEMPTR variable, and creates an XML document. (See the "Publishing, subscribing, and receiving an XML document in a BytesMessage" section.)

The following code-page conversions take place:
In this example, the XML parser parses the XML document correctly if:
Code page example 2
In this example, two 4GL clients use ISO8859-1 for their internal code page. Client1 saves a UTF-8 encoded XML document in a MEMPTR variable (calling the X-DOC:SAVE() 4GL method) and then uses the 4GL GET-STRING function to extract the text from the MEMPTR and pass it into the XMLMessage. (This is a deliberate error.) UTF-8 (Unicode Transformation Format) is an 8-bit encoding form that serializes a Unicode scalar value as a sequence of one to four bytes.

A 4GL client cannot mix code pages. The text it sets in the XMLMessage must be encoded in the same code page as the client’s internal code page. In general, a MEMPTR variable must be used carefully, since it can hold any data. The 4GL programmer must be sure that it contains only NULL-free text (no embedded NULL bytes), encoded in the internal code page, before loading it into an XMLMessage.

In this example, if the 4GL client cannot be started with -cpinternal UTF-8 but must still use 4GL-JMS to pass the UTF-8 document, it can use a BytesMessage or bytes elements in a StreamMessage. When sent as bytes, the XML data reaches the receiver uninterpreted and unconverted. The 4GL receiver can then set the data in a MEMPTR variable and load the parser.

A second option is to convert the text (and the document’s header) to ISO8859-1 using the CODEPAGE-CONVERT 4GL function. However, if the -cpinternal code page represents all of the document’s characters, the conversion is automatic when you use LONGCHAR or CHAR. It is also automatic when you use the new built-in XML routines (SAX-WRITER or setX-Document). With the new built-in XML routines, you can create, send, and receive UTF-8 XML documents.

If the 4GL receiver of an XMLMessage is unsure about the encoding declaration in the XML header, it must check the declaration and, if necessary, modify it to match its internal code page before loading the parser.
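The deliberate error in this example, and the conversion remedy, can be sketched at the byte level. The Python code below is an illustration under stated assumptions, not 4GL: `INTERNAL` stands in for the -cpinternal setting, `convert_document` is a hypothetical helper that plays the role the text assigns to CODEPAGE-CONVERT plus a header edit, and the regular expression assumes a double-quoted encoding declaration.

```python
import re
import xml.etree.ElementTree as ET

INTERNAL = "iso8859-1"  # stands in for the client's -cpinternal setting

# Hypothetical UTF-8 document, as it would sit in a raw memory buffer.
utf8_doc = (
    '<?xml version="1.0" encoding="UTF-8"?>'
    "<name>Ren\u00e9</name>"
).encode("utf-8")

# The deliberate error: UTF-8 bytes are read as internal-code-page text.
# The two bytes of the e-acute become two unrelated characters.
misread = utf8_doc.decode(INTERNAL)


def convert_document(doc: bytes, source: str, target: str,
                     declared_name: str) -> bytes:
    """Convert the text and rewrite the header declaration to match.

    Assumes the declaration is written as encoding="..." (double quotes).
    Raises UnicodeEncodeError if target cannot represent a character.
    """
    text = doc.decode(source)
    text = re.sub(r'encoding="[^"]*"',
                  f'encoding="{declared_name}"', text, count=1)
    return text.encode(target)


converted = convert_document(utf8_doc, "utf-8", INTERNAL, "ISO-8859-1")

print(repr(misread))
print(converted)
print(ET.fromstring(converted).text == "Ren\u00e9")
```

Converting the text without also updating the declaration would leave the header claiming UTF-8 for ISO8859-1 bytes, which is the mismatch the receiver is told to check for.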
Copyright © 2005 Progress Software Corporation www.progress.com Voice: (781) 280-4000 Fax: (781) 280-4095